library(knitr)
opts_knit$set(root.dir = "/Users/Morgan/Documents/Research/McMaster/Modeling_Work/Mike_Lunchbox/My_edits/lunchbox")
options(mc.cores = parallel::detectCores())
#opts_chunk$set(cache = TRUE)
NIMBLE is built in R but compiles your models and algorithms using C++ for speed
NIMBLE is most commonly used for MCMC but can also be used to implement a series of other algorithms (e.g. particle filtering, MCEM)
1. A system for writing statistical models flexibly, which is an extension of the BUGS language
2. A library of algorithms such as MCMC.
3. A language, called NIMBLE, embedded within and similar in style to R, for writing algorithms that operate on BUGS models.
One of the most important concepts behind NIMBLE is to allow a combination of highlevel processing in R and low-level processing in compiled C++.
On Windows, you should download and install Rtools.exe available from http://cran. r-project.org/bin/windows/Rtools/.
On OS X, you should install Xcode.
After these are installed you can install NIMBLE in R using
install.packages(“nimble”, repos = “http://r-nimble.org”, type = “source”)
Please post about installation problems to the nimble-users Google group or email nimble.stats@gmail.com.
You will also need to download STAN using the following commands
Sys.setenv(MAKEFLAGS = “-j4”)
install.packages(“rstan”, dependencies = TRUE)
In total you will need the following pakages:
library("nimble")
library("R2jags")
library("ggplot2")
library("nimble")
library("rstan")
library("igraph")
library("parallel")
library("mcmcplots")
library("lattice")
library("coda")
Programming in NIMBLE involves a fundamental distinction between:
1. the steps for an algorithm that need to happen only once, at the beginning, such as inspecting the model
2. the steps that need to happen each time a function is called, such as MCMC iterations.
When one writes a nimbleFunction, each of these parts can be provided separately.
Multiple parameterizations for distributions, similar to those in R, can be used. NIMBLE calls non-stochastic nodes “deterministic”, whereas BUGS calls them “logical”. NIMBLE uses “logical” in the way R does, to refer to boolean (TRUE/FALSE) variables.
Alternative models can be defined from the same model code by using if-then-else statements that are evaluated when the model is defined.
The general outline for this presentation follows along with the NIMBLE users manual
http://r-nimble.org/documentation-2
However, the model(s) used here are written by us
1.1 Build a chain binomial model in JAGS. Conduct parameter estimation
1.2 Translate the model into NIBLE. Conduct parameter estimation
1.2.1 Model exploration/conversion
1.2.2 Create a basic MCMC specification for the chain binomial, compile and run the MCMC
1.2.3 Small MCMC specification adjustments (more on this in Part 3)
1.3 Compare the JAGS and NIMBLE results (parameter estimates, uncertainty, convergence, efficiency)
2.1 Translate the model using a “hybrid approach” (STAN does not allow for discrete latent variables)
2.1.1 Conduct parameter estimation using JAGS and NIMBLE
2.1.2 Run the hybrid model in STAN and compare the results from JAGS, NIMBLE and STAN
2.2 Compare the NIMBLE Chain Binomial and STAN hybrid model
3.1 Expolore more fine-tuned adjustments that can be made in NIMBLE
3.1.1 NIMBLE functions (e.g. allows for the implementation of custom samplers)
4.1 NIMBLE extras:
4.1.1 Create, compile and run a Monte Carlo Expectation Maximization (MCEM) algorithm, which illustrates some of the flexibility NIMBLE provides to combine R and NIMBLE.
4.1.2 Implement particle filtering for the chain binomial
5.1 Misc NIMBLE notes (truncated distributions, lifted nodes, logProb)
First step is to construct the simulator from which we will obtain our data
Note: It will be important to set your current working directory to “../stat744/notes/NIMBLE”
Set parameters and load the Chain Binomial simulator
beta <- 0.02
pop <- 100
effpropS <- 0.8
effpropI <- 0.2
reporting <- 0.5
s0 <- effpropS*pop
r0 <- 0
zerohack <- 0.001
numobs <- 12
nimtimevec <- c()
source("CBsimulator.R")
sim <- simCB(beta = beta, pop = pop, effpropS = effpropS, effpropI = effpropI,
t0 = 1, numobs = numobs, reporting = reporting, seed = 3)
sim
## time S I R Iobs
## 1 1 80 5 0 1
## 2 2 70 10 5 5
## 3 3 59 11 15 6
## 4 4 47 12 26 4
## 5 5 38 9 38 5
## 6 6 31 7 47 4
## 7 7 27 4 54 2
## 8 8 25 2 58 2
## 9 9 23 2 60 0
## 10 10 22 1 62 1
## 11 11 22 0 63 0
## 12 12 22 0 63 0
Take a peek at what this model produces
Set up the required arguments to run the JAGS model
data <- list(obs = sim$Iobs, pop = pop, numobs = nrow(sim), r0 = r0)
inits <- list(list(I = sim$I * 1, effpropS = effpropS - 0.1, effpropI = effpropI -
0.15, beta = beta + 0.05, reporting = reporting), list(I = sim$I * 1 + 1,
effpropS = effpropS - 0.1, effpropI = effpropI + 0.2, beta = beta + 0.1,
reporting = reporting), list(I = sim$I * 1 + 2, effpropS = effpropS, effpropI = effpropI,
beta = beta - 0.008, reporting = reporting))
params = c("beta", "effpropS", "effpropI", "reporting")
# rjags::set.factory('bugs::Conjugate', FALSE, type='sampler')
Create the model and examine the MCMC algorithms that JAGS will use to sample
cbjagsmodel <- jags.model(data = data,
inits = inits,
file = "CB.bug",
n.chains = length(inits))
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 84
##
## Initializing model
list.samplers(cbjagsmodel)
## $ShiftedCount
## [1] "I[12]"
##
## $DiscreteSlicer
## [1] "I[11]"
##
## $DiscreteSlicer
## [1] "I[10]"
##
## $DiscreteSlicer
## [1] "I[9]"
##
## $DiscreteSlicer
## [1] "I[8]"
##
## $DiscreteSlicer
## [1] "I[7]"
##
## $DiscreteSlicer
## [1] "I[6]"
##
## $DiscreteSlicer
## [1] "I[5]"
##
## $DiscreteSlicer
## [1] "I[4]"
##
## $DiscreteSlicer
## [1] "I[3]"
##
## $DiscreteSlicer
## [1] "I[2]"
##
## $DiscreteSlicer
## [1] "I[1]"
##
## $DiscreteSlicer
## [1] "S[1]"
##
## $RealSlicer
## [1] "beta"
##
## $ConjugateBeta
## [1] "effpropI"
##
## $ConjugateBeta
## [1] "effpropS"
##
## $ConjugateBeta
## [1] "reporting"
Run some chains (could use coda::coda.samples from cbjagsmodel but here we will just run jags())
jagstime <- system.time(cbjags <- jags(data = data,
inits = inits,
param = params,
model.file = "CB.bug",
n.iter = 11000,
n.burnin = 1000,
n.thin = 20,
n.chains = length(inits)))
## module glm loaded
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 84
##
## Initializing model
cbjags
## Inference for Bugs model at "CB.bug", fit using jags,
## 3 chains, each with 11000 iterations (first 1000 discarded), n.thin = 20
## n.sims = 1500 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## beta 0.022 0.007 0.012 0.017 0.020 0.025 0.039 1.002 1200
## effpropI 0.391 0.270 0.035 0.155 0.331 0.589 0.943 1.003 730
## effpropS 0.820 0.128 0.506 0.753 0.855 0.918 0.974 1.006 430
## reporting 0.481 0.119 0.298 0.391 0.464 0.553 0.762 1.005 370
## deviance 29.965 3.350 23.533 27.913 29.820 32.034 37.025 1.008 290
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
##
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 5.6 and DIC = 35.5
## DIC is an estimate of expected predictive error (lower deviance is better).
xyplot(as.mcmc(cbjags))
source('nimCB.R')
Set up the model. Here we need: Constants, Data, Initial Values, NIMBLE model object
nimCBdata <- list(obs = sim$Iobs)
nimCBcon <- list(numobs = numobs, pop = pop, r0 = r0)
nimCBinits <- list(I = sim$I,
effpropS = effpropS,
effpropI = effpropI,
beta = beta,
reporting = reporting,
s0 = s0)
nimtimevec[1] <- system.time(CBout <- nimbleModel(code = nimcode,
name = 'CBout',
constants = nimCBcon,
data = nimCBdata,
inits = nimCBinits))[3]
## defining model...
## building model...
## setting data and initial values...
## checking model... (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
CBout$getNodeNames()
## [1] "reporting" "effpropS" "effpropI"
## [4] "beta" "R[1]" "s0"
## [7] "lifted_d100_minus_s0" "S[1]" "I[1]"
## [10] "pSI[1]" "obs[1]" "R[2]"
## [13] "I[2]" "S[2]" "R[3]"
## [16] "pSI[2]" "obs[2]" "I[3]"
## [19] "S[3]" "R[4]" "pSI[3]"
## [22] "obs[3]" "I[4]" "S[4]"
## [25] "R[5]" "pSI[4]" "obs[4]"
## [28] "I[5]" "S[5]" "R[6]"
## [31] "pSI[5]" "obs[5]" "I[6]"
## [34] "S[6]" "R[7]" "pSI[6]"
## [37] "obs[6]" "I[7]" "S[7]"
## [40] "R[8]" "pSI[7]" "obs[7]"
## [43] "I[8]" "S[8]" "R[9]"
## [46] "pSI[8]" "obs[8]" "I[9]"
## [49] "S[9]" "R[10]" "pSI[9]"
## [52] "obs[9]" "I[10]" "S[10]"
## [55] "R[11]" "pSI[10]" "obs[10]"
## [58] "I[11]" "S[11]" "R[12]"
## [61] "pSI[11]" "obs[11]" "I[12]"
## [64] "S[12]" "pSI[12]" "obs[12]"
CBout$obs
## [1] 1 5 6 4 5 4 2 2 0 1 0 0
par(mfrow = c(1,1))
plot(CBout$graph)
nimbleModel does its best to initialize a model, but let’s say you want to re-initialize I.
simulate(CBout, 'I')
CBout$I
## [1] 3 6 15 16 10 6 3 1 2 0 0 0
CBout$getDependencies(c("beta"))
## [1] "beta" "pSI[1]" "I[2]" "pSI[2]" "I[3]" "pSI[3]" "I[4]"
## [8] "pSI[4]" "I[5]" "pSI[5]" "I[6]" "pSI[6]" "I[7]" "pSI[7]"
## [15] "I[8]" "pSI[8]" "I[9]" "pSI[9]" "I[10]" "pSI[10]" "I[11]"
## [22] "pSI[11]" "I[12]" "pSI[12]"
CBout$getDependencies(c("beta"), determOnly = TRUE)
## [1] "pSI[1]" "pSI[2]" "pSI[3]" "pSI[4]" "pSI[5]" "pSI[6]" "pSI[7]"
## [8] "pSI[8]" "pSI[9]" "pSI[10]" "pSI[11]" "pSI[12]"
nimtimevec[2] <- system.time(CBoutC <- compileNimble(CBout))[3]
Configure the MCMC with the default options (we will return to customizing this setup later)
nimtimevec[3] <- system.time(CBoutSpec <- configureMCMC(CBout, print = TRUE))[3]
## [1] RW sampler: reporting, adaptive: TRUE, adaptInterval: 200, scale: 1
## [2] RW sampler: effpropS, adaptive: TRUE, adaptInterval: 200, scale: 1
## [3] RW sampler: effpropI, adaptive: TRUE, adaptInterval: 200, scale: 1
## [4] RW sampler: beta, adaptive: TRUE, adaptInterval: 200, scale: 1
## [5] slice sampler: s0, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [6] slice sampler: I[1], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [7] slice sampler: I[2], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [8] slice sampler: I[3], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [9] slice sampler: I[4], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [10] slice sampler: I[5], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [11] slice sampler: I[6], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [12] slice sampler: I[7], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [13] slice sampler: I[8], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [14] slice sampler: I[9], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [15] slice sampler: I[10], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [16] slice sampler: I[11], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [17] slice sampler: I[12], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
Add chain monitors for the parameters of interest and add thinning
CBoutSpec$addMonitors(c("beta", "effpropS", "effpropI", "reporting"))
## thin = 1: reporting, effpropS, effpropI, beta
CBoutSpec$setThin(20)
## thin = 20: reporting, effpropS, effpropI, beta
Build the MCMC
nimtimevec[4] <- system.time(CBoutMCMC <- buildMCMC(CBoutSpec))[3]
nimtimevec[5] <- system.time(CBoutMCMC <- compileNimble(CBoutMCMC, project = CBout, resetFunctions = TRUE))[3]
niter <- 11000
set.seed(0)
nimtimevec[6] <- system.time(CBoutMCMC$run(niter))[3]
Quick peek at time required. Below we will take a look at efficiency using a convenient coda function
Gross time required
jagstime[3]
## elapsed
## 12.015
sum(nimtimevec[1:6], na.rm = TRUE)
## [1] 16.58
nimtimevec[6]
## [1] 0.94
Efficiency (Net time in a sense)
samples <- as.matrix(CBoutMCMC$mvSamples)
head(samples)
## beta effpropI effpropS reporting
## [1,] 0.02 0.20731932 0.8427785 0.5000000
## [2,] 0.02 0.05087168 0.7887759 0.6916757
## [3,] 0.02 0.28765490 0.7887759 0.5316554
## [4,] 0.02 0.14800198 0.7111727 0.6957240
## [5,] 0.02 0.30758335 0.7386033 0.6028016
## [6,] 0.02 0.33371754 0.8991708 0.4605610
jags_eff <- effectiveSize(as.mcmc.list(as.mcmc(cbjags))) / nimtimevec[1]
nim_eff <- effectiveSize(as.mcmc.list(as.mcmc(samples))) / nimtimevec[6]
jags_eff
## beta deviance effpropI effpropS reporting
## 439.3923 814.5146 529.4877 315.7712 492.6933
nim_eff
## beta effpropI effpropS reporting
## 80.84285 131.48813 71.82709 128.10869
Save these points for later
def_effS <- samples[ , 'effpropS']
def_effI <- samples[ , 'effpropI']
Look at the correlation in the chains
acf(samples[, "beta"])
acf(samples[, "reporting"])
acf(samples[, "effpropS"])
acf(samples[, "effpropI"])
A few undesirable results here… we can add a block sampler to decrease correlation
Take a look at the samplers being used
CBoutSpec$getSamplers()
## [1] RW sampler: reporting, adaptive: TRUE, adaptInterval: 200, scale: 1
## [2] RW sampler: effpropS, adaptive: TRUE, adaptInterval: 200, scale: 1
## [3] RW sampler: effpropI, adaptive: TRUE, adaptInterval: 200, scale: 1
## [4] RW sampler: beta, adaptive: TRUE, adaptInterval: 200, scale: 1
## [5] slice sampler: s0, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [6] slice sampler: I[1], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [7] slice sampler: I[2], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [8] slice sampler: I[3], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [9] slice sampler: I[4], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [10] slice sampler: I[5], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [11] slice sampler: I[6], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [12] slice sampler: I[7], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [13] slice sampler: I[8], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [14] slice sampler: I[9], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [15] slice sampler: I[10], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [16] slice sampler: I[11], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [17] slice sampler: I[12], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addSampler(target = c('effpropS', 'effpropI'), type = 'RW_block',
control = list(adaptInterval = 10000))
## [18] RW_block sampler: effpropS, effpropI, adaptive: TRUE, adaptScaleOnly: FALSE, adaptInterval: 10000, scale: 1, propCov: identity
CBoutSpec$setThin(30)
## thin = 30: reporting, effpropS, effpropI, beta
CBoutMCMC <- buildMCMC(CBoutSpec)
CBoutMCMC <- compileNimble(CBoutMCMC, project = CBout, resetFunctions = TRUE)
CBoutMCMC$run(30000)
## NULL
samplesNew <- as.matrix(CBoutMCMC$mvSamples)
Check for an imporvement
par(mfrow = c(2,2))
acf(samplesNew[, "effpropS"])
acf(samplesNew[, "effpropI"])
plot(samplesNew[ , 'effpropS'], type = 'l', xlab = 'iteration')
plot(samplesNew[ , 'effpropI'], type = 'l', xlab = 'iteration')
par(mfrow = c(1,1))
plot(samplesNew[ , 'effpropS'], samplesNew[ , 'effpropI'], pch = 20)
points(def_effS, def_effI, pch = 20, col = "blue")
Well that didn’t do anything…
NIMBLE allows for specification of samplers by parameter or node by node (NIMBLE included or user created)
CBout <- nimbleModel(code = nimcode,
name = 'CBout',
constants = nimCBcon,
data = nimCBdata,
inits = nimCBinits)
## defining model...
## building model...
## setting data and initial values...
## checking model... (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
CBoutC <- compileNimble(CBout)
CBoutSpec <- configureMCMC(CBout, print = TRUE)
## [1] RW sampler: reporting, adaptive: TRUE, adaptInterval: 200, scale: 1
## [2] RW sampler: effpropS, adaptive: TRUE, adaptInterval: 200, scale: 1
## [3] RW sampler: effpropI, adaptive: TRUE, adaptInterval: 200, scale: 1
## [4] RW sampler: beta, adaptive: TRUE, adaptInterval: 200, scale: 1
## [5] slice sampler: s0, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [6] slice sampler: I[1], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [7] slice sampler: I[2], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [8] slice sampler: I[3], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [9] slice sampler: I[4], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [10] slice sampler: I[5], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [11] slice sampler: I[6], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [12] slice sampler: I[7], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [13] slice sampler: I[8], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [14] slice sampler: I[9], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [15] slice sampler: I[10], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [16] slice sampler: I[11], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [17] slice sampler: I[12], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addMonitors(c("beta", "effpropS", "effpropI", "reporting"))
## thin = 1: reporting, effpropS, effpropI, beta
CBoutSpec$setThin(20)
## thin = 20: reporting, effpropS, effpropI, beta
CBoutSpec$removeSamplers(c("beta", "effpropS", "effpropI", "reporting"), print = TRUE)
## [1] slice sampler: s0, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [2] slice sampler: I[1], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [3] slice sampler: I[2], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [4] slice sampler: I[3], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [5] slice sampler: I[4], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [6] slice sampler: I[5], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [7] slice sampler: I[6], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [8] slice sampler: I[7], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [9] slice sampler: I[8], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [10] slice sampler: I[9], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [11] slice sampler: I[10], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [12] slice sampler: I[11], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
## [13] slice sampler: I[12], adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addSampler("beta", type = "slice", print = TRUE)
## [14] slice sampler: beta, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addSampler("effpropS", type = "slice", print = TRUE)
## [15] slice sampler: effpropS, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addSampler("effpropI", type = "slice", print = TRUE)
## [16] slice sampler: effpropI, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutSpec$addSampler("reporting", type = "slice", print = TRUE)
## [17] slice sampler: reporting, adaptive: TRUE, adaptInterval: 200, sliceWidth: 1, sliceMaxSteps: 100
CBoutMCMC <- buildMCMC(CBoutSpec)
CBoutMCMC <- compileNimble(CBoutMCMC, project = CBout, resetFunctions = TRUE)
CBoutMCMC$run(30000)
## NULL
samplesNew <- as.matrix(CBoutMCMC$mvSamples)
par(mfrow = c(2,2))
acf(samplesNew[, "effpropS"])
acf(samplesNew[, "effpropI"])
plot(samplesNew[ , 'effpropS'], type = 'l', xlab = 'iteration')
plot(samplesNew[ , 'effpropI'], type = 'l', xlab = 'iteration')
par(mfrow = c(1,1))
plot(samplesNew[ , 'effpropS'], samplesNew[ , 'effpropI'], pch = 20)
points(def_effS, def_effI, pch = 20, col = "blue")
We can also compare the NIMBLE model simultaneously with the JAGS model using MCMCsuite()
Be warned: running this code will produce about 6-8 graphs which will all pop up in separate windows!
nimcb <- MCMCsuite(code = nimcode,
data = nimCBdata,
inits = nimCBinits,
constants = nimCBcon,
MCMCs = c("jags", "nimble"),
monitors = c("beta", "reporting", "effpropS", "effpropI"),
niter = 12000,
calculateEfficiency = TRUE,
makePlot = FALSE,
savePlot = FALSE)
## defining model...
## building model...
## setting data and initial values...
## checking model... (use nimbleModel(..., check = FALSE) to skip model check)
## model building finished
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph Size: 83
##
## Initializing model
nimcb$summary
## , , beta
##
## mean median sd CI95_low CI95_upp n
## jags 0.02211968 0.02044865 0.007749308 0.01228891 0.04245738 10000
## nimble 0.02238331 0.02026432 0.008585741 0.01212697 0.04593279 10000
## ess efficiency
## jags 195.3439 62.25109
## nimble 83.1905 79.07843
##
## , , reporting
##
## mean median sd CI95_low CI95_upp n ess
## jags 0.5178239 0.4943196 0.1396390 0.3072968 0.8401621 10000 269.0024
## nimble 0.5291161 0.5031207 0.1402128 0.3107321 0.8428592 10000 223.0523
## efficiency
## jags 85.72416
## nimble 212.02693
##
## , , effpropS
##
## mean median sd CI95_low CI95_upp n ess
## jags 0.7974094 0.8421026 0.1448468 0.4557955 0.9703215 10000 149.14588
## nimble 0.7917268 0.8326756 0.1502838 0.4371907 0.9729178 10000 52.59819
## efficiency
## jags 47.52896
## nimble 49.99827
##
## , , effpropI
##
## mean median sd CI95_low CI95_upp n ess
## jags 0.3706128 0.2992648 0.2725763 0.03221043 0.9491894 10000 234.6599
## nimble 0.3646788 0.2788137 0.2803558 0.02503840 0.9535620 10000 118.3843
## efficiency
## jags 74.78009
## nimble 112.53258
nimcb$efficiency
## $min
## jags nimble
## 47.52896 49.99827
##
## $mean
## jags nimble
## 67.57107 113.40905
nimcb$timing
## jags nimble nimble_compile
## 3.138 1.052 13.866
We must rewrite the model so that there are no discrete latent variables. We call this the “hybrid model”
An asside – Discrete Latent Variables:
An additional asside – Hamiltonian MCMC:
But before we fit the model in STAN lets explore the hybrid model in NIMBLE
NIMBLE allows us to compare the results of multiple models even if they have different parameterizations (e.g. Chain Binomial and the Hybrid Model)
data$obs <- data$obs + zerohack # Guarnantee that obs remains above 0 (important for the gamma)
data$zerohack <- zerohack
hybridjags <- jags(data = data,
inits = inits,
param = params,
model.file = "hybrid.bug",
n.iter = 8000,
n.chains = length(inits))
source('nimhybrid.R')
nimhydata <- list(obs = sim$Iobs + zerohack)
nimhycon <- list(numobs = numobs, pop = pop, r0 = r0, zerohack = zerohack)
nimhyinits <- list(I = sim$I + zerohack,
effpropS = effpropS,
effpropI = effpropI,
beta = beta,
reporting = reporting,
s0 = s0)
nimcb <- MCMCsuite(code = nimcode,
data = nimhydata,
inits = nimhyinits,
constants = nimhycon,
MCMCs = c("jags", "nimble"),
monitors = c("beta", "reporting", "effpropS", "effpropI"),
niter = 10000,
makePlot = FALSE,
savePlot = FALSE)
Run the STAN model
stantime <- system.time (s1 <- stan(file='hybrid.stan', data = data, init = inits,
pars=c("beta", "reporting", "effpropS", "effpropI", "I"), iter = 8000,
seed = 1001, chains = length(inits)))
Compare all three methods using the hybrid model
nimhydata <- list(obs = sim$Iobs + zerohack)
nimhycon <- list(numobs = numobs, pop = pop,
r0 = r0, zerohack = zerohack)
nimhyinits <- list(I = sim$I + zerohack,
effpropS = effpropS,
effpropI = effpropI,
beta = beta,
reporting = reporting,
s0 = s0)
allhybrids <- MCMCsuite(code = nimcode,
data = nimhydata,
inits = nimhyinits,
constants = nimhycon,
stan_model = "hybrid.stan",
MCMCs = c("jags", "nimble", "stan"),
monitors = c("beta", "reporting", "effpropS", "effpropI"),
niter = 10000,
makePlot = FALSE,
savePlot = FALSE)
nimCBdata <- list(obs = sim$Iobs)
nimCBcon <- list(numobs = numobs, pop = pop, r0 = r0, zerohack = zerohack)
nimCBinits <- list(I = sim$I,
effpropS = effpropS,
effpropI = effpropI,
beta = beta,
reporting = reporting,
s0 = s0)
nimcb <- MCMCsuite(code = nimcode,
data = nimCBdata,
inits = nimCBinits,
constants = nimCBcon,
stan_model = "hybrid.stan",
MCMCs = c("jags", "nimble", "stan"),
monitors = c("beta", "reporting", "effpropS", "effpropI"),
niter = 10000,
makePlot = TRUE,
savePlot = TRUE)
## the name of this sampler function, for the purposes of ## adding it to MCMC configurations, will be 'my_RW' my_RW <- nimbleFunction(
## sampler functions must contain 'sampler_BASE'
contains = sampler_BASE,
## sampler functions must have exactly these setup arguments: ## model, mvSaved, target, control
setup = function(model, mvSaved, target, control) {
## first, extract the control list elements, which will
## dictate the behavior of this sampler.
## the setup code will be later processed to determine
## all named elements extracted from the control list.
## these will become the required elements for any
## control list argument to this sampler, unless they also
## exist in the NIMBLE system option 'MCMCcontrolDefaultList'.
## the random walk proposal standard deviation
scale <- control$scale
## determine the list of all dependent nodes,
## up to the first layer of stochastic nodes, generally
## called 'calcNodes'. The values, inputs, and logProbs
## of these nodes will be retrieved and/or altered
## by this algorithm.
calcNodes <- model$getDependencies(target)
},
## the run function must accept no arguments, execute
## the sampling algorithm, leave the modelValues object
## 'mvSaved' as an exact copy of the updated values in model, ## and have no return value. initially, mvSaved contains
## an exact copy of the values and logProbs in the model.
run = function() {
## extract the initial model logProb
model_lp_initial <- getLogProb(model, calcNodes) ## generate a proposal value for target node
proposal <- rnorm(1, model[[target]], scale)
## store this proposed value into the target node.
## notice the double assignment operator, `<<-`,
## necessary because 'model' is a persistent member
## data object of this sampler.
model[[target]] <<- proposal
## calculate target_logProb, propagate the
## proposed value through any deterministic dependents, ## and calculate the logProb for any stochastic
## dependnets. The total (sum) logProb is returned. model_lp_proposed <- calculate(model, calcNodes)
## calculate the log Metropolis-Hastings ratio
log_MH_ratio <- model_lp_proposed - model_lp_initial
## Metropolis-Hastings step: determine whether or ## not to accept the newly proposed value
u <- runif(1, 0, 1)
if (u < exp(log_MH_ratio)) jump <- TRUE
else jump <- FALSE
## if we accepted the proposal, then store the updated ## values and logProbs from 'model' into 'mvSaved'.
## if the proposal was not accepted, restore the values ## and logProbs from 'mvSaved' back into 'model'. if(jump) copy(from = model, to = mvSaved, row = 1,
nodes = calcNodes, logProb = TRUE) else copy(from = mvSaved, to = model, row = 1, nodes = calcNodes, logProb = TRUE)
},
## sampler functions must have a member method 'reset',
## which takes no arguments and has no return value.
## this function is used to reset the sampler to its
## initial state. since this sampler function maintains
## no internal member data variables, reset() needn't
## do anything.
methods = list(
reset = function () {} )
)
## now, assume the existence of an R model object 'Rmodel',
## which has a scalar-valued stochastic node 'x'
## create an MCMC configuration with no sampler functions
mcmcspec <- configureMCMC(Rmodel, nodes = NULL)
## add our custom-built random walk sampler on node 'x', ## with a fixed proposal standard deviation = 0.1 mcmcspec$addSampler(target = 'x', type = 'my_RW',
control = list(scale = 0.1))
Rmcmc <- buildMCMC(mcmcspec) ## etc...
Suppose we have a model with missing data (or a layer of latent variables that can be treated as missing data) and we would like to maximize the marginal likelihood of the model, integrating over the missing data. A brute-force method for doing this is MCEM.
Start by building the model
hybemout <- nimbleModel(code = nimcode,
name = 'hybemout',
constants = nimhycon,
data = nimhydata,
inits = nimhyinits)
hybMCEM <- buildMCEM(model = hybemout, latentNodes = list("I", "s0"),
burnIn = 100, mcmcControl = list(adaptInterval = 20),
boxConstraints = list( list( c("beta", "reporting", "effpropS", "effpropI"),
limits = c(0, Inf) ) ),
buffer = 1e-6)
# The MCEM algorithm allows for box constraints on the nodes that will be optimized,
# specified via the boxConstraints argument. This is highly recommended for nodes that
# have zero density on parts of the real line
CBMCEM(maxit = 20, m1 = 150, m2 = 200)
Set up the Nimble model
CBpfout <- nimbleModel(code = nimcode,
name = 'CBpfout',
constants = nimCBcon,
data = nimCBdata,
check = FALSE)
Build the particle filter
CBpfoutC <- compileNimble(CBpfout)
CBpf <- buildPF(CBpfout, c("I"))
CBpfC <- compileNimble(CBpf, project = CBpfout)
Set your parameters
CBpfoutC$beta = 0.02
CBpfoutC$effpropS = 0.8
CBpfoutC$effpropI = 0.2
CBpfoutC$reporting = 0.5
Obtain log-likelihood
CBpfC$run(m = 5000)
Currently relatively useless as is…
Use this framework to construct your own updater
Truncation of distributions
• \(x \∼ N(0, sd = 10) T(0, a)\), or
• x ∼ T(dnorm(0, sd = 10), 0, a),
• mu1 ~ dnorm(0, 1)
• mu2 ~ dnorm(0, 1)
• constraint_data ~ dconstraint( mu1 + mu2 > 0 )
Lifted Nodes
• The use of link functions causes new nodes to be introduced
• When distribution parameters are expressions, NIMBLE creates a new deterministic node that contains the expression for a given parameter
logProb
• For each variable that contains at least one stochastic node, NIMBLE generates a model variable with the prefix “logProb”
• Can be retrieved with getLogProb
Choice of Samplers
1. If the node has no stochastic dependents, a predictive end sampler is assigned. The end sampling algorithm merely calls simulate on the particular node.
2. The node is checked for presence of a conjugate relationship between its prior distribution and the distributions of its stochastic dependents. If it is determined to be in a conjugate relationship, then the corresponding conjugate (Gibbs) sampler is assigned.
3. If the node is discrete-valued, then a slice sampler is assigned [5].
4. If the node follows a multivariate distribution, then a RW block sampler is assigned for all elements. This is a Metropolis-Hastings adaptive random-walk sampler with a multivariate normal proposal [6].
5. If none of the above criteria are satisfied, then a RW sampler is assigned. This is a Metropolis-Hastings adaptive random-walk sampler with a univariate normal proposal distribution.
Missing Values
See pg 43